Overview

Dataset statistics

Number of variables: 19
Number of observations: 100000
Missing cells: 95887
Missing cells (%): 5.0%
Duplicate rows: 10215
Duplicate rows (%): 10.2%
Total size in memory: 14.5 MiB
Average record size in memory: 152.0 B
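The derived figures above follow from the raw counts. The sketch below redoes that arithmetic. It assumes the missing-cell percentage is taken over all cells (variables × observations) and that 1 MiB is 1024 × 1024 bytes; neither is stated in the report, and the variable names are illustrative.

```python
# Values copied from the dataset statistics above.
n_variables = 19
n_observations = 100_000
missing_cells = 95_887
duplicate_rows = 10_215
total_size_mib = 14.5

# Assumption: the missing-cell percentage is taken over every cell in the table.
missing_pct = 100 * missing_cells / (n_variables * n_observations)

# Assumption: the duplicate percentage is taken over all observations.
duplicate_pct = 100 * duplicate_rows / n_observations

# Assumption: 1 MiB = 1024 * 1024 bytes.
avg_record_bytes = total_size_mib * 1024 * 1024 / n_observations

print(f"Missing cells: {missing_pct:.1f}%")
print(f"Duplicate rows: {duplicate_pct:.1f}%")
print(f"Average record size: {avg_record_bytes:.1f} B")
```

Under these assumptions the three printed values agree with the rounded figures in the table.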

Variable types

Categorical: 7
Numeric: 12

Alerts

Dataset has 10215 (10.2%) duplicate rows [Duplicates]
Loan ID has a high cardinality: 81999 distinct values [High cardinality]
Customer ID has a high cardinality: 81999 distinct values [High cardinality]
Annual Income is highly correlated with Monthly Debt [High correlation]
Monthly Debt is highly correlated with Current Credit Balance [High correlation]
Number of Credit Problems is highly correlated with Bankruptcies and 1 other field [High correlation]
Current Credit Balance is highly correlated with Monthly Debt [High correlation]
Maximum Open Credit is highly correlated with Current Credit Balance [High correlation]
Bankruptcies is highly correlated with Number of Credit Problems [High correlation]
Tax Liens is highly correlated with Number of Credit Problems [High correlation]
Loan Status is highly correlated with Credit Score [High correlation]
Credit Score is highly correlated with Loan Status [High correlation]
Home Ownership is highly correlated with Purpose [High correlation]
Purpose is highly correlated with Home Ownership [High correlation]
Credit Score has 19154 (19.2%) missing values [Missing]
Annual Income has 19154 (19.2%) missing values [Missing]
Years in current job has 4222 (4.2%) missing values [Missing]
Months since last delinquent has 53141 (53.1%) missing values [Missing]
Annual Income is highly skewed (γ1 = 46.88869873) [Skewed]
Maximum Open Credit is highly skewed (γ1 = 132.6388749) [Skewed]
Loan ID is uniformly distributed [Uniform]
Customer ID is uniformly distributed [Uniform]
Number of Credit Problems has 86035 (86.0%) zeros [Zeros]
Bankruptcies has 88774 (88.8%) zeros [Zeros]
Tax Liens has 98062 (98.1%) zeros [Zeros]
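
The Skewed alerts quote a skewness coefficient γ1. As a rough illustration, the sketch below computes the moment-based skewness with the standard library. The report's own estimator may apply a bias correction, so small differences from its figures would be expected. The function name and sample data are hypothetical.

```python
def skewness(values):
    """Moment-based skewness g1 = m3 / m2**1.5 (no bias correction)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((x - mean) ** 2 for x in values) / n  # second central moment
    m3 = sum((x - mean) ** 3 for x in values) / n  # third central moment
    return m3 / m2 ** 1.5


symmetric = [1, 2, 3, 4, 5]
right_tailed = [1, 1, 1, 2, 50]  # one large value stretches the right tail

print(skewness(symmetric))
print(skewness(right_tailed))
```

A symmetric sample gives zero, while a single large value pushes the coefficient well above zero, which is the pattern the alerts flag for Annual Income and Maximum Open Credit.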

Reproduction

Analysis started: 2022-12-15 10:05:10.108191
Analysis finished: 2022-12-15 10:05:28.182106
Duration: 18.07 seconds
Software version: pandas-profiling v3.3.0
Download configuration: config.json
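
The reported duration can be checked against the two timestamps. A minimal sketch using only the standard library (variable names are illustrative):

```python
from datetime import datetime

# Timestamps copied from the Reproduction section above.
started = datetime.fromisoformat("2022-12-15 10:05:10.108191")
finished = datetime.fromisoformat("2022-12-15 10:05:28.182106")

duration_seconds = (finished - started).total_seconds()
print(f"{duration_seconds:.2f} seconds")
```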

Variables

Loan ID
Categorical

HIGH CARDINALITY
UNIFORM

Distinct: 81999
Distinct (%): 82.0%
Missing: 0
Missing (%): 0.0%
Memory size: 781.4 KiB
14dd8831-6af5-400b-83ec-68e61888a048: 2
1373bfdf-ae6d-4fa5-b3e9-73ba60b7868e: 2
111602d9-f958-403d-a9d5-4fba630297eb: 2
1076b681-b8ff-463f-a7af-576e12edd637: 2
f7817002-4e22-4462-8abc-402eb4ceaa6c: 2
Other values (81994): 99990

Length

Max length: 36
Median length: 36
Mean length: 36
Min length: 36

Characters and Unicode

Total characters: 3600000
Distinct characters: 17
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
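
As an illustration of those character properties, the sketch below counts the characters of one Loan ID by Unicode general category using the standard library's `unicodedata` module. It assumes the report's category names map to the usual two-letter codes: `Nd` (Decimal Number), `Ll` (Lowercase Letter) and `Pd` (Dash Punctuation). The helper name is hypothetical.

```python
import unicodedata
from collections import Counter


def category_counts(text):
    """Count characters by Unicode general category code (e.g. 'Nd', 'Ll')."""
    return Counter(unicodedata.category(ch) for ch in text)


loan_id = "14dd8831-6af5-400b-83ec-68e61888a048"  # first sample row of Loan ID
counts = category_counts(loan_id)
print(counts)
```

Each 36-character ID contains four dashes, which fits the 400000 dash characters reported for 100000 rows.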

Unique

Unique: 63998
Unique (%): 64.0%
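
The sketch below shows one reading of these two counts: "Distinct" as the number of different values and "Unique" as the number of values that occur exactly once. That reading is an assumption, but it fits the figures: 81999 − 63998 = 18001 values are repeated, and 63998 + 2 × 18001 = 100000, so every repeated Loan ID would appear exactly twice. The helper name and toy data are hypothetical.

```python
from collections import Counter


def distinct_and_unique(values):
    """Return (number of different values, number of values seen exactly once)."""
    counts = Counter(values)
    distinct = len(counts)
    unique = sum(1 for c in counts.values() if c == 1)
    return distinct, unique


toy_ids = ["a", "b", "b", "c", "d", "d"]
print(distinct_and_unique(toy_ids))

# Figures copied from the report.
n_rows, distinct, unique = 100_000, 81_999, 63_998
repeated = distinct - unique
print(repeated, unique + 2 * repeated == n_rows)
```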

Sample

1st row: 14dd8831-6af5-400b-83ec-68e61888a048
2nd row: 4771cc26-131a-45db-b5aa-537ea4ba5342
3rd row: 4eed4e6a-aa2f-4c91-8651-ce984ee8fb26
4th row: 77598f7b-32e7-4e3b-a6e5-06ba0d98fe8a
5th row: d4062e70-befa-4995-8643-a0de73938182

Common Values

Value | Count | Frequency (%)
14dd8831-6af5-400b-83ec-68e61888a048 | 2 | < 0.1%
1373bfdf-ae6d-4fa5-b3e9-73ba60b7868e | 2 | < 0.1%
111602d9-f958-403d-a9d5-4fba630297eb | 2 | < 0.1%
1076b681-b8ff-463f-a7af-576e12edd637 | 2 | < 0.1%
f7817002-4e22-4462-8abc-402eb4ceaa6c | 2 | < 0.1%
cbad155d-99e3-4cb5-b8ef-dc8ad694488c | 2 | < 0.1%
d9add07d-0988-4f3d-838d-a42a4fbd05ce | 2 | < 0.1%
96638bcb-d939-41b0-9bb5-d951d6a4dcc2 | 2 | < 0.1%
52d510dd-c367-4472-b450-125bb66f3288 | 2 | < 0.1%
1c6d8406-97bb-416d-a416-701afb398983 | 2 | < 0.1%
Other values (81989) | 99980 | > 99.9%

Length

Histogram of lengths of the category
Value | Count | Frequency (%)
14dd8831-6af5-400b-83ec-68e61888a048 | 2 | < 0.1%
4795c1ce-0d77-400a-b370-0ffc7b688a0d | 2 | < 0.1%
1585509a-1a31-4460-949f-2624ed054176 | 2 | < 0.1%
4c6d5ca9-b4f7-4724-b60d-4c39919f33dd | 2 | < 0.1%
070efe1e-7911-4927-b466-e7541fc557a3 | 2 | < 0.1%
6eb14cde-2de6-48df-a6dc-72527642d105 | 2 | < 0.1%
e8f5ca19-f3f7-487f-aba7-26c0b600a7a3 | 2 | < 0.1%
c9488ac4-87ed-4da7-bdda-194e7d828894 | 2 | < 0.1%
9fe9322c-fb7b-4777-a763-a6ec220e4d53 | 2 | < 0.1%
95533358-c862-4fdb-9ca9-0b2124209247 | 2 | < 0.1%
Other values (81989) | 99980 | > 99.9%

Most occurring characters

Value | Count | Frequency (%)
- | 400000 | 11.1%
4 | 287551 | 8.0%
a | 212644 | 5.9%
9 | 212346 | 5.9%
b | 211946 | 5.9%
8 | 211864 | 5.9%
0 | 188172 | 5.2%
2 | 188148 | 5.2%
c | 188078 | 5.2%
7 | 187813 | 5.2%
Other values (7) | 1311438 | 36.4%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 2025282 | 56.3%
Lowercase Letter | 1174718 | 32.6%
Dash Punctuation | 400000 | 11.1%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
4 | 287551 | 14.2%
9 | 212346 | 10.5%
8 | 211864 | 10.5%
0 | 188172 | 9.3%
2 | 188148 | 9.3%
7 | 187813 | 9.3%
1 | 187606 | 9.3%
5 | 187480 | 9.3%
3 | 187225 | 9.2%
6 | 187077 | 9.2%

Lowercase Letter
Value | Count | Frequency (%)
a | 212644 | 18.1%
b | 211946 | 18.0%
c | 188078 | 16.0%
e | 187476 | 16.0%
d | 187374 | 16.0%
f | 187200 | 15.9%

Dash Punctuation
Value | Count | Frequency (%)
- | 400000 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 2425282 | 67.4%
Latin | 1174718 | 32.6%

Most frequent character per script

Common
Value | Count | Frequency (%)
- | 400000 | 16.5%
4 | 287551 | 11.9%
9 | 212346 | 8.8%
8 | 211864 | 8.7%
0 | 188172 | 7.8%
2 | 188148 | 7.8%
7 | 187813 | 7.7%
1 | 187606 | 7.7%
5 | 187480 | 7.7%
3 | 187225 | 7.7%

Latin
Value | Count | Frequency (%)
a | 212644 | 18.1%
b | 211946 | 18.0%
c | 188078 | 16.0%
e | 187476 | 16.0%
d | 187374 | 16.0%
f | 187200 | 15.9%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 3600000 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
- | 400000 | 11.1%
4 | 287551 | 8.0%
a | 212644 | 5.9%
9 | 212346 | 5.9%
b | 211946 | 5.9%
8 | 211864 | 5.9%
0 | 188172 | 5.2%
2 | 188148 | 5.2%
c | 188078 | 5.2%
7 | 187813 | 5.2%
Other values (7) | 1311438 | 36.4%

Customer ID
Categorical

HIGH CARDINALITY
UNIFORM

Distinct: 81999
Distinct (%): 82.0%
Missing: 0
Missing (%): 0.0%
Memory size: 781.4 KiB
981165ec-3274-42f5-a3b4-d104041a9ca9: 2
3b5f11f9-0951-43e2-bed5-d24d1e4e8b76: 2
add9dac6-c1ad-4441-ba96-274fbf493379: 2
7bc16148-ec0a-4b75-89fc-20ca13350175: 2
3b561060-1b20-4c36-95c2-9a9b1bf54195: 2
Other values (81994): 99990

Length

Max length: 36
Median length: 36
Mean length: 36
Min length: 36

Characters and Unicode

Total characters: 3600000
Distinct characters: 17
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 63998
Unique (%): 64.0%

Sample

1st row: 981165ec-3274-42f5-a3b4-d104041a9ca9
2nd row: 2de017a3-2e01-49cb-a581-08169e83be29
3rd row: 5efb2b2b-bf11-4dfd-a572-3761a2694725
4th row: e777faab-98ae-45af-9a86-7ce5b33b1011
5th row: 81536ad9-5ccf-4eb8-befb-47a4d608658e

Common Values

Value | Count | Frequency (%)
981165ec-3274-42f5-a3b4-d104041a9ca9 | 2 | < 0.1%
3b5f11f9-0951-43e2-bed5-d24d1e4e8b76 | 2 | < 0.1%
add9dac6-c1ad-4441-ba96-274fbf493379 | 2 | < 0.1%
7bc16148-ec0a-4b75-89fc-20ca13350175 | 2 | < 0.1%
3b561060-1b20-4c36-95c2-9a9b1bf54195 | 2 | < 0.1%
f4eed63c-0c95-4ffa-bf31-585c73ff3f91 | 2 | < 0.1%
a42ab3d2-db8b-4d20-a357-f26c7a6f10b5 | 2 | < 0.1%
5c5731e6-64da-4e40-a0d2-f858957263ae | 2 | < 0.1%
9606e990-2ab6-4e5d-ba59-a4cf61254b48 | 2 | < 0.1%
6a03d0e7-5d8d-4399-a944-236f40224d31 | 2 | < 0.1%
Other values (81989) | 99980 | > 99.9%

Length

Histogram of lengths of the category
Value | Count | Frequency (%)
981165ec-3274-42f5-a3b4-d104041a9ca9 | 2 | < 0.1%
02a9a103-ce6d-4d24-ad1d-1e68074a0e75 | 2 | < 0.1%
6d13a492-d15a-4f16-9785-4751d947bf64 | 2 | < 0.1%
89a9a394-95eb-46e0-aabc-d1dd39dbbc25 | 2 | < 0.1%
9cf7c200-a608-4404-883a-17964f19e3c3 | 2 | < 0.1%
260c52a3-46b8-4cdc-89e7-352960557344 | 2 | < 0.1%
69f7f1af-4464-4d8b-8c41-17d15b7b5e4d | 2 | < 0.1%
123b0aba-90c7-4d67-9d17-d8566fd91e74 | 2 | < 0.1%
f1cb42a4-a25d-4cc6-8cdf-94c147ec9906 | 2 | < 0.1%
eeb8789e-5124-4854-9aab-7aa178d0f82e | 2 | < 0.1%
Other values (81989) | 99980 | > 99.9%

Most occurring characters

Value | Count | Frequency (%)
- | 400000 | 11.1%
4 | 287889 | 8.0%
a | 212705 | 5.9%
8 | 212170 | 5.9%
b | 211850 | 5.9%
9 | 211791 | 5.9%
c | 188049 | 5.2%
e | 188024 | 5.2%
7 | 187963 | 5.2%
5 | 187790 | 5.2%
Other values (7) | 1311769 | 36.4%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 2024943 | 56.2%
Lowercase Letter | 1175057 | 32.6%
Dash Punctuation | 400000 | 11.1%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
4 | 287889 | 14.2%
8 | 212170 | 10.5%
9 | 211791 | 10.5%
7 | 187963 | 9.3%
5 | 187790 | 9.3%
6 | 187624 | 9.3%
3 | 187601 | 9.3%
1 | 187434 | 9.3%
0 | 187400 | 9.3%
2 | 187281 | 9.2%

Lowercase Letter
Value | Count | Frequency (%)
a | 212705 | 18.1%
b | 211850 | 18.0%
c | 188049 | 16.0%
e | 188024 | 16.0%
d | 187287 | 15.9%
f | 187142 | 15.9%

Dash Punctuation
Value | Count | Frequency (%)
- | 400000 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 2424943 | 67.4%
Latin | 1175057 | 32.6%

Most frequent character per script

Common
Value | Count | Frequency (%)
- | 400000 | 16.5%
4 | 287889 | 11.9%
8 | 212170 | 8.7%
9 | 211791 | 8.7%
7 | 187963 | 7.8%
5 | 187790 | 7.7%
6 | 187624 | 7.7%
3 | 187601 | 7.7%
1 | 187434 | 7.7%
0 | 187400 | 7.7%

Latin
Value | Count | Frequency (%)
a | 212705 | 18.1%
b | 211850 | 18.0%
c | 188049 | 16.0%
e | 188024 | 16.0%
d | 187287 | 15.9%
f | 187142 | 15.9%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 3600000 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
- | 400000 | 11.1%
4 | 287889 | 8.0%
a | 212705 | 5.9%
8 | 212170 | 5.9%
b | 211850 | 5.9%
9 | 211791 | 5.9%
c | 188049 | 5.2%
e | 188024 | 5.2%
7 | 187963 | 5.2%
5 | 187790 | 5.2%
Other values (7) | 1311769 | 36.4%

Loan Status
Categorical

HIGH CORRELATION

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 781.4 KiB
Fully Paid: 77361
Charged Off: 22639

Length

Max length: 11
Median length: 10
Mean length: 10.22639
Min length: 10

Characters and Unicode

Total characters: 1022639
Distinct characters: 16
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Fully Paid
2nd row: Fully Paid
3rd row: Fully Paid
4th row: Fully Paid
5th row: Fully Paid

Common Values

Value | Count | Frequency (%)
Fully Paid | 77361 | 77.4%
Charged Off | 22639 | 22.6%

Length

Histogram of lengths of the category

Category Frequency Plot

Value | Count | Frequency (%)
fully | 77361 | 38.7%
paid | 77361 | 38.7%
charged | 22639 | 11.3%
off | 22639 | 11.3%

Most occurring characters

Value | Count | Frequency (%)
l | 154722 | 15.1%
(space) | 100000 | 9.8%
a | 100000 | 9.8%
d | 100000 | 9.8%
F | 77361 | 7.6%
u | 77361 | 7.6%
y | 77361 | 7.6%
P | 77361 | 7.6%
i | 77361 | 7.6%
f | 45278 | 4.4%
Other values (6) | 135834 | 13.3%

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 722639 | 70.7%
Uppercase Letter | 200000 | 19.6%
Space Separator | 100000 | 9.8%

Most frequent character per category

Lowercase Letter
Value | Count | Frequency (%)
l | 154722 | 21.4%
a | 100000 | 13.8%
d | 100000 | 13.8%
u | 77361 | 10.7%
y | 77361 | 10.7%
i | 77361 | 10.7%
f | 45278 | 6.3%
h | 22639 | 3.1%
r | 22639 | 3.1%
g | 22639 | 3.1%

Uppercase Letter
Value | Count | Frequency (%)
F | 77361 | 38.7%
P | 77361 | 38.7%
C | 22639 | 11.3%
O | 22639 | 11.3%

Space Separator
Value | Count | Frequency (%)
(space) | 100000 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 922639 | 90.2%
Common | 100000 | 9.8%

Most frequent character per script

Latin
Value | Count | Frequency (%)
l | 154722 | 16.8%
a | 100000 | 10.8%
d | 100000 | 10.8%
F | 77361 | 8.4%
u | 77361 | 8.4%
y | 77361 | 8.4%
P | 77361 | 8.4%
i | 77361 | 8.4%
f | 45278 | 4.9%
C | 22639 | 2.5%
Other values (5) | 113195 | 12.3%

Common
Value | Count | Frequency (%)
(space) | 100000 | 100.0%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 1022639 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
l | 154722 | 15.1%
(space) | 100000 | 9.8%
a | 100000 | 9.8%
d | 100000 | 9.8%
F | 77361 | 7.6%
u | 77361 | 7.6%
y | 77361 | 7.6%
P | 77361 | 7.6%
i | 77361 | 7.6%
f | 45278 | 4.4%
Other values (6) | 135834 | 13.3%

Current Loan Amount
Real number (ℝ≥0)

Distinct: 22004
Distinct (%): 22.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 11760447.39
Minimum: 10802
Maximum: 99999999
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 781.4 KiB

Quantile statistics

Minimum: 10802
5-th percentile: 76054
Q1: 179652
median: 312246
Q3: 524942
95-th percentile: 99999999
Maximum: 99999999
Range: 99989197
Interquartile range (IQR): 345290

Descriptive statistics

Standard deviation: 31783942.55
Coefficient of variation (CV): 2.702613387
Kurtosis: 3.837366224
Mean: 11760447.39
Median Absolute Deviation (MAD): 147664
Skewness: 2.415986608
Sum: 1.176044739 × 10^12
Variance: 1.010219004 × 10^15
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
99999999 | 11484 | 11.5%
223102 | 27 | < 0.1%
223322 | 27 | < 0.1%
216194 | 27 | < 0.1%
223652 | 27 | < 0.1%
108966 | 26 | < 0.1%
222596 | 26 | < 0.1%
214962 | 25 | < 0.1%
216810 | 25 | < 0.1%
179828 | 25 | < 0.1%
Other values (21994) | 88281 | 88.3%

Value | Count | Frequency (%)
10802 | 1 | < 0.1%
11242 | 1 | < 0.1%
15422 | 2 | < 0.1%
21098 | 1 | < 0.1%
21450 | 3 | < 0.1%
21472 | 9 | < 0.1%
21494 | 3 | < 0.1%
21516 | 6 | < 0.1%
21538 | 7 | < 0.1%
21560 | 4 | < 0.1%

Value | Count | Frequency (%)
99999999 | 11484 | 11.5%
789250 | 3 | < 0.1%
789184 | 6 | < 0.1%
789096 | 16 | < 0.1%
789030 | 9 | < 0.1%
788942 | 9 | < 0.1%
788876 | 4 | < 0.1%
788788 | 5 | < 0.1%
788722 | 2 | < 0.1%
788634 | 8 | < 0.1%
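
The value 99999999 is both the maximum and the 95th percentile and occurs 11484 times, while the next largest value is below 800000. That pattern would be consistent with a placeholder rather than a real amount, although the report itself does not say so. A minimal sketch, under that assumption, of masking the value before summarising; the sample rows and helper name are made up for illustration.

```python
import statistics

SENTINEL = 99_999_999  # assumption: placeholder for an unknown amount


def mask_sentinel(amounts, sentinel=SENTINEL):
    """Return amounts with the sentinel replaced by None."""
    return [None if a == sentinel else a for a in amounts]


# Hypothetical rows; only the sentinel value is taken from the report.
amounts = [179_652, 312_246, 99_999_999, 524_942, 99_999_999]
masked = mask_sentinel(amounts)
valid = [a for a in masked if a is not None]

print(statistics.mean(amounts))  # dominated by the sentinel
print(statistics.mean(valid))    # mean of the remaining amounts
```

Whether masking is appropriate depends on what the placeholder means in the source system, which the report does not document.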

Term
Categorical

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 781.4 KiB
Short Term: 72208
Long Term: 27792

Length

Max length: 10
Median length: 10
Mean length: 9.72208
Min length: 9

Characters and Unicode

Total characters: 972208
Distinct characters: 12
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Short Term
2nd row: Short Term
3rd row: Short Term
4th row: Long Term
5th row: Short Term

Common Values

Value | Count | Frequency (%)
Short Term | 72208 | 72.2%
Long Term | 27792 | 27.8%

Length

Histogram of lengths of the category

Category Frequency Plot

Value | Count | Frequency (%)
term | 100000 | 50.0%
short | 72208 | 36.1%
long | 27792 | 13.9%

Most occurring characters

Value | Count | Frequency (%)
r | 172208 | 17.7%
o | 100000 | 10.3%
(space) | 100000 | 10.3%
T | 100000 | 10.3%
e | 100000 | 10.3%
m | 100000 | 10.3%
S | 72208 | 7.4%
h | 72208 | 7.4%
t | 72208 | 7.4%
L | 27792 | 2.9%
Other values (2) | 55584 | 5.7%

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 672208 | 69.1%
Uppercase Letter | 200000 | 20.6%
Space Separator | 100000 | 10.3%

Most frequent character per category

Lowercase Letter
Value | Count | Frequency (%)
r | 172208 | 25.6%
o | 100000 | 14.9%
e | 100000 | 14.9%
m | 100000 | 14.9%
h | 72208 | 10.7%
t | 72208 | 10.7%
n | 27792 | 4.1%
g | 27792 | 4.1%

Uppercase Letter
Value | Count | Frequency (%)
T | 100000 | 50.0%
S | 72208 | 36.1%
L | 27792 | 13.9%

Space Separator
Value | Count | Frequency (%)
(space) | 100000 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 872208 | 89.7%
Common | 100000 | 10.3%

Most frequent character per script

Latin
Value | Count | Frequency (%)
r | 172208 | 19.7%
o | 100000 | 11.5%
T | 100000 | 11.5%
e | 100000 | 11.5%
m | 100000 | 11.5%
S | 72208 | 8.3%
h | 72208 | 8.3%
t | 72208 | 8.3%
L | 27792 | 3.2%
n | 27792 | 3.2%

Common
Value | Count | Frequency (%)
(space) | 100000 | 100.0%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 972208 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
r | 172208 | 17.7%
o | 100000 | 10.3%
(space) | 100000 | 10.3%
T | 100000 | 10.3%
e | 100000 | 10.3%
m | 100000 | 10.3%
S | 72208 | 7.4%
h | 72208 | 7.4%
t | 72208 | 7.4%
L | 27792 | 2.9%
Other values (2) | 55584 | 5.7%

Credit Score
Real number (ℝ≥0)

HIGH CORRELATION
MISSING

Distinct: 324
Distinct (%): 0.4%
Missing: 19154
Missing (%): 19.2%
Infinite: 0
Infinite (%): 0.0%
Mean: 1076.456089
Minimum: 585
Maximum: 7510
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 781.4 KiB

Quantile statistics

Minimum: 585
5-th percentile: 661
Q1: 705
median: 724
Q3: 741
95-th percentile: 6690
Maximum: 7510
Range: 6925
Interquartile range (IQR): 36

Descriptive statistics

Standard deviation: 1475.403791
Coefficient of variation (CV): 1.370612147
Kurtosis: 12.97183192
Mean: 1076.456089
Median Absolute Deviation (MAD): 17
Skewness: 3.86322527
Sum: 87027169
Variance: 2176816.348
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
747 | 1825 | 1.8%
740 | 1746 | 1.7%
746 | 1742 | 1.7%
741 | 1732 | 1.7%
742 | 1723 | 1.7%
739 | 1624 | 1.6%
745 | 1612 | 1.6%
748 | 1598 | 1.6%
743 | 1555 | 1.6%
725 | 1548 | 1.5%
Other values (314) | 64141 | 64.1%
(Missing) | 19154 | 19.2%

Value | Count | Frequency (%)
585 | 12 | < 0.1%
586 | 7 | < 0.1%
587 | 11 | < 0.1%
588 | 20 | < 0.1%
589 | 6 | < 0.1%
590 | 8 | < 0.1%
591 | 9 | < 0.1%
592 | 4 | < 0.1%
593 | 7 | < 0.1%
594 | 10 | < 0.1%

Value | Count | Frequency (%)
7510 | 9 | < 0.1%
7500 | 24 | < 0.1%
7490 | 23 | < 0.1%
7480 | 43 | < 0.1%
7470 | 51 | 0.1%
7460 | 76 | 0.1%
7450 | 55 | 0.1%
7440 | 58 | 0.1%
7430 | 70 | 0.1%
7420 | 84 | 0.1%
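
The quartiles sit between 705 and 741, yet the 95th percentile is 6690 and the ten largest values (7420 to 7510) all end in zero. One possible explanation, not confirmed by the report, is that some scores were stored multiplied by ten. The sketch below rescales under that hypothesis; the cutoff of 1000 is an arbitrary choice for illustration and the helper name is hypothetical.

```python
def rescale_score(score, cutoff=1000):
    """Divide scores above the cutoff by 10; pass missing values (None) through.

    Assumption: values above the cutoff were recorded at ten times their scale.
    """
    if score is None:
        return None
    return score / 10 if score > cutoff else score


scores = [585, 724, 7510, None, 6690]
rescaled = [rescale_score(s) for s in scores]
print(rescaled)
```

If the hypothesis holds, the rescaled values fall back into the same range as the rest of the column; it should be checked against the data source before being applied.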

Annual Income
Real number (ℝ≥0)

HIGH CORRELATION
MISSING
SKEWED

Distinct: 36174
Distinct (%): 44.7%
Missing: 19154
Missing (%): 19.2%
Infinite: 0
Infinite (%): 0.0%
Mean: 1378276.56
Minimum: 76627
Maximum: 165557393
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 781.4 KiB

Quantile statistics

Minimum: 76627
5-th percentile: 519863.75
Q1: 848844
median: 1174162
Q3: 1650663
95-th percentile: 2810579.75
Maximum: 165557393
Range: 165480766
Interquartile range (IQR): 801819

Descriptive statistics

Standard deviation: 1081360.196
Coefficient of variation (CV): 0.7845741756
Kurtosis: 6624.167709
Mean: 1378276.56
Median Absolute Deviation (MAD): 380703
Skewness: 46.88869873
Sum: 1.114281468 × 10^11
Variance: 1.169339873 × 10^12
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
1162572 | 22 | < 0.1%
973370 | 19 | < 0.1%
969475 | 18 | < 0.1%
1140000 | 18 | < 0.1%
1143762 | 17 | < 0.1%
949905 | 17 | < 0.1%
1112640 | 17 | < 0.1%
953230 | 17 | < 0.1%
1320291 | 17 | < 0.1%
1166220 | 17 | < 0.1%
Other values (36164) | 80667 | 80.7%
(Missing) | 19154 | 19.2%

Value | Count | Frequency (%)
76627 | 1 | < 0.1%
81092 | 1 | < 0.1%
94867 | 1 | < 0.1%
97033 | 1 | < 0.1%
106533 | 1 | < 0.1%
111245 | 2 | < 0.1%
130150 | 1 | < 0.1%
134881 | 2 | < 0.1%
135071 | 2 | < 0.1%
144267 | 2 | < 0.1%

Value | Count | Frequency (%)
165557393 | 1 | < 0.1%
36475440 | 1 | < 0.1%
30838995 | 1 | < 0.1%
28095300 | 1 | < 0.1%
24161540 | 1 | < 0.1%
23980375 | 1 | < 0.1%
22448880 | 2 | < 0.1%
19019000 | 1 | < 0.1%
18768200 | 2 | < 0.1%
18743937 | 1 | < 0.1%

Years in current job
Categorical

MISSING

Distinct: 11
Distinct (%): < 0.1%
Missing: 4222
Missing (%): 4.2%
Memory size: 781.4 KiB
10+ years: 31121
2 years: 9134
3 years: 8169
< 1 year: 8164
5 years: 6787
Other values (6): 32403

Length

Max length: 9
Median length: 7
Mean length: 7.667648103
Min length: 6

Characters and Unicode

Total characters: 734392
Distinct characters: 18
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 8 years
2nd row: 10+ years
3rd row: 8 years
4th row: 3 years
5th row: 5 years

Common Values

Value | Count | Frequency (%)
10+ years | 31121 | 31.1%
2 years | 9134 | 9.1%
3 years | 8169 | 8.2%
< 1 year | 8164 | 8.2%
5 years | 6787 | 6.8%
1 year | 6460 | 6.5%
4 years | 6143 | 6.1%
6 years | 5686 | 5.7%
7 years | 5577 | 5.6%
8 years | 4582 | 4.6%
(Missing) | 4222 | 4.2%
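
Because this variable stores durations as text labels, any numeric analysis needs a conversion step first. A minimal sketch of one possible mapping follows. The choices of 0 for "< 1 year" and 10 for "10+ years" are assumptions made for illustration, and the helper name is hypothetical.

```python
import re


def years_in_job(label):
    """Map labels such as '10+ years' or '< 1 year' to a number.

    Assumptions: '< 1 year' -> 0, '10+ years' -> 10, missing -> None.
    """
    if label is None:
        return None
    if label.strip().startswith("<"):
        return 0
    match = re.search(r"\d+", label)
    return int(match.group()) if match else None


labels = ["10+ years", "< 1 year", "1 year", "8 years", None]
parsed = [years_in_job(x) for x in labels]
print(parsed)
```

Note that "10+ years" is an open-ended bucket, so treating it as exactly 10 understates longer tenures.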

Length

Histogram of lengths of the category
Value | Count | Frequency (%)
years | 81154 | 40.6%
10 | 31121 | 15.6%
1 | 14624 | 7.3%
year | 14624 | 7.3%
2 | 9134 | 4.6%
3 | 8169 | 4.1%
< | 8164 | 4.1%
5 | 6787 | 3.4%
4 | 6143 | 3.1%
6 | 5686 | 2.8%
Other values (3) | 14114 | 7.1%

Most occurring characters

Value | Count | Frequency (%)
(space) | 103942 | 14.2%
y | 95778 | 13.0%
e | 95778 | 13.0%
a | 95778 | 13.0%
r | 95778 | 13.0%
s | 81154 | 11.1%
1 | 45745 | 6.2%
0 | 31121 | 4.2%
+ | 31121 | 4.2%
2 | 9134 | 1.2%
Other values (8) | 49063 | 6.7%

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 464266 | 63.2%
Decimal Number | 126899 | 17.3%
Space Separator | 103942 | 14.2%
Math Symbol | 39285 | 5.3%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
1 | 45745 | 36.0%
0 | 31121 | 24.5%
2 | 9134 | 7.2%
3 | 8169 | 6.4%
5 | 6787 | 5.3%
4 | 6143 | 4.8%
6 | 5686 | 4.5%
7 | 5577 | 4.4%
8 | 4582 | 3.6%
9 | 3955 | 3.1%

Lowercase Letter
Value | Count | Frequency (%)
y | 95778 | 20.6%
e | 95778 | 20.6%
a | 95778 | 20.6%
r | 95778 | 20.6%
s | 81154 | 17.5%

Math Symbol
Value | Count | Frequency (%)
+ | 31121 | 79.2%
< | 8164 | 20.8%

Space Separator
Value | Count | Frequency (%)
(space) | 103942 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 464266 | 63.2%
Common | 270126 | 36.8%

Most frequent character per script

Common
Value | Count | Frequency (%)
(space) | 103942 | 38.5%
1 | 45745 | 16.9%
0 | 31121 | 11.5%
+ | 31121 | 11.5%
2 | 9134 | 3.4%
3 | 8169 | 3.0%
< | 8164 | 3.0%
5 | 6787 | 2.5%
4 | 6143 | 2.3%
6 | 5686 | 2.1%
Other values (3) | 14114 | 5.2%

Latin
Value | Count | Frequency (%)
y | 95778 | 20.6%
e | 95778 | 20.6%
a | 95778 | 20.6%
r | 95778 | 20.6%
s | 81154 | 17.5%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 734392 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
(space) | 103942 | 14.2%
y | 95778 | 13.0%
e | 95778 | 13.0%
a | 95778 | 13.0%
r | 95778 | 13.0%
s | 81154 | 11.1%
1 | 45745 | 6.2%
0 | 31121 | 4.2%
+ | 31121 | 4.2%
2 | 9134 | 1.2%
Other values (8) | 49063 | 6.7%

Home Ownership
Categorical

HIGH CORRELATION

Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 781.4 KiB
Home Mortgage: 48410
Rent: 42194
Own Home: 9182
HaveMortgage: 214

Length

Max length: 13
Median length: 12
Mean length: 8.7413
Min length: 4

Characters and Unicode

Total characters: 874130
Distinct characters: 15
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Home Mortgage
2nd row: Home Mortgage
3rd row: Own Home
4th row: Own Home
5th row: Rent

Common Values

Value | Count | Frequency (%)
Home Mortgage | 48410 | 48.4%
Rent | 42194 | 42.2%
Own Home | 9182 | 9.2%
HaveMortgage | 214 | 0.2%

Length

Histogram of lengths of the category

Category Frequency Plot

Value | Count | Frequency (%)
home | 57592 | 36.5%
mortgage | 48410 | 30.7%
rent | 42194 | 26.8%
own | 9182 | 5.8%
havemortgage | 214 | 0.1%

Most occurring characters

Value | Count | Frequency (%)
e | 148624 | 17.0%
o | 106216 | 12.2%
g | 97248 | 11.1%
t | 90818 | 10.4%
H | 57806 | 6.6%
m | 57592 | 6.6%
(space) | 57592 | 6.6%
n | 51376 | 5.9%
a | 48838 | 5.6%
M | 48624 | 5.6%
Other values (5) | 109396 | 12.5%

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 658732 | 75.4%
Uppercase Letter | 157806 | 18.1%
Space Separator | 57592 | 6.6%

Most frequent character per category

Lowercase Letter
Value | Count | Frequency (%)
e | 148624 | 22.6%
o | 106216 | 16.1%
g | 97248 | 14.8%
t | 90818 | 13.8%
m | 57592 | 8.7%
n | 51376 | 7.8%
a | 48838 | 7.4%
r | 48624 | 7.4%
w | 9182 | 1.4%
v | 214 | < 0.1%

Uppercase Letter
Value | Count | Frequency (%)
H | 57806 | 36.6%
M | 48624 | 30.8%
R | 42194 | 26.7%
O | 9182 | 5.8%

Space Separator
Value | Count | Frequency (%)
(space) | 57592 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 816538 | 93.4%
Common | 57592 | 6.6%

Most frequent character per script

Latin
Value | Count | Frequency (%)
e | 148624 | 18.2%
o | 106216 | 13.0%
g | 97248 | 11.9%
t | 90818 | 11.1%
H | 57806 | 7.1%
m | 57592 | 7.1%
n | 51376 | 6.3%
a | 48838 | 6.0%
M | 48624 | 6.0%
r | 48624 | 6.0%
Other values (4) | 60772 | 7.4%

Common
Value | Count | Frequency (%)
(space) | 57592 | 100.0%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 874130 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
e | 148624 | 17.0%
o | 106216 | 12.2%
g | 97248 | 11.1%
t | 90818 | 10.4%
H | 57806 | 6.6%
m | 57592 | 6.6%
(space) | 57592 | 6.6%
n | 51376 | 5.9%
a | 48838 | 5.6%
M | 48624 | 5.6%
Other values (5) | 109396 | 12.5%

Purpose
Categorical

HIGH CORRELATION

Distinct: 16
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 781.4 KiB
Debt Consolidation: 78552
other: 6037
Home Improvements: 5839
Other: 3250
Business Loan: 1569
Other values (11): 4753

Length

Max length: 20
Median length: 18
Mean length: 16.32015
Min length: 5

Characters and Unicode

Total characters: 1632015
Distinct characters: 35
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Home Improvements
2nd row: Debt Consolidation
3rd row: Debt Consolidation
4th row: Debt Consolidation
5th row: Debt Consolidation

Common Values

Value | Count | Frequency (%)
Debt Consolidation | 78552 | 78.6%
other | 6037 | 6.0%
Home Improvements | 5839 | 5.8%
Other | 3250 | 3.2%
Business Loan | 1569 | 1.6%
Buy a Car | 1265 | 1.3%
Medical Bills | 1127 | 1.1%
Buy House | 678 | 0.7%
Take a Trip | 573 | 0.6%
major_purchase | 352 | 0.4%
Other values (6) | 758 | 0.8%
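
The table lists "other" and "Other" as separate categories and mixes Title Case labels with snake_case ones such as "major_purchase". A minimal sketch of one way to merge such variants by normalising case and separators follows. The normalisation rule and helper name are assumptions for illustration, and only four of the sixteen categories are included.

```python
from collections import Counter


def normalize_label(label):
    """Lower-case the label, turn underscores into spaces, collapse whitespace."""
    return " ".join(label.replace("_", " ").lower().split())


# Counts copied from the Common Values table above (subset only).
counts = Counter({
    "Debt Consolidation": 78552,
    "other": 6037,
    "Other": 3250,
    "major_purchase": 352,
})

merged = Counter()
for label, count in counts.items():
    merged[normalize_label(label)] += count

print(merged)
```

After normalisation the two spellings of "other" combine to 9287 rows, which matches the count the report gives for the word "other".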

Length

Histogram of lengths of the category
Value | Count | Frequency (%)
debt | 78552 | 41.0%
consolidation | 78552 | 41.0%
other | 9287 | 4.8%
home | 5839 | 3.0%
improvements | 5839 | 3.0%
buy | 1943 | 1.0%
a | 1838 | 1.0%
business | 1569 | 0.8%
loan | 1569 | 0.8%
car | 1265 | 0.7%
Other values (13) | 5287 | 2.8%

Most occurring characters

Value | Count | Frequency (%)
o | 256320 | 15.7%
t | 172430 | 10.6%
n | 166948 | 10.2%
i | 162248 | 9.9%
e | 110301 | 6.8%
s | 92585 | 5.7%
(space) | 91540 | 5.6%
a | 86321 | 5.3%
l | 82608 | 5.1%
d | 80008 | 4.9%
Other values (25) | 330706 | 20.3%

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 1357176 | 83.2%
Uppercase Letter | 182654 | 11.2%
Space Separator | 91540 | 5.6%
Connector Punctuation | 645 | < 0.1%

Most frequent character per category

Lowercase Letter
Value | Count | Frequency (%)
o | 256320 | 18.9%
t | 172430 | 12.7%
n | 166948 | 12.3%
i | 162248 | 12.0%
e | 110301 | 8.1%
s | 92585 | 6.8%
a | 86321 | 6.4%
l | 82608 | 6.1%
d | 80008 | 5.9%
b | 78845 | 5.8%
Other values (13) | 68562 | 5.1%

Uppercase Letter
Value | Count | Frequency (%)
C | 79817 | 43.7%
D | 78552 | 43.0%
H | 6517 | 3.6%
I | 5839 | 3.2%
B | 4639 | 2.5%
O | 3250 | 1.8%
L | 1569 | 0.9%
T | 1146 | 0.6%
M | 1127 | 0.6%
E | 198 | 0.1%

Space Separator
Value | Count | Frequency (%)
(space) | 91540 | 100.0%

Connector Punctuation
Value | Count | Frequency (%)
_ | 645 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 1539830 | 94.4%
Common | 92185 | 5.6%

Most frequent character per script

Latin
Value | Count | Frequency (%)
o | 256320 | 16.6%
t | 172430 | 11.2%
n | 166948 | 10.8%
i | 162248 | 10.5%
e | 110301 | 7.2%
s | 92585 | 6.0%
a | 86321 | 5.6%
l | 82608 | 5.4%
d | 80008 | 5.2%
C | 79817 | 5.2%
Other values (23) | 250244 | 16.3%

Common
Value | Count | Frequency (%)
(space) | 91540 | 99.3%
_ | 645 | 0.7%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 1632015 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
o | 256320 | 15.7%
t | 172430 | 10.6%
n | 166948 | 10.2%
i | 162248 | 9.9%
e | 110301 | 6.8%
s | 92585 | 5.7%
(space) | 91540 | 5.6%
a | 86321 | 5.3%
l | 82608 | 5.1%
d | 80008 | 4.9%
Other values (25) | 330706 | 20.3%

Monthly Debt
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 65765
Distinct (%): 65.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 18472.41234
Minimum: 0
Maximum: 435843.28
Zeros: 74
Zeros (%): 0.1%
Negative: 0
Negative (%): 0.0%
Memory size: 781.4 KiB

Quantile statistics

Minimum: 0
5-th percentile: 3714.823
Q1: 10214.1625
median: 16220.3
Q3: 24012.0575
95-th percentile: 40477.6095
Maximum: 435843.28
Range: 435843.28
Interquartile range (IQR): 13797.895
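
The range and interquartile range follow directly from the quantiles listed above. The sketch below recomputes them and, purely as an illustration, adds the upper fence from Tukey's 1.5 × IQR rule of thumb. That fence is a common convention and is not something the report computes.

```python
# Quantile statistics copied from the Monthly Debt table above.
minimum, q1, q3, maximum = 0.0, 10214.1625, 24012.0575, 435843.28

iqr = q3 - q1                    # interquartile range
value_range = maximum - minimum  # full spread of the data

# Assumption for illustration only: Tukey's 1.5 * IQR rule of thumb.
upper_fence = q3 + 1.5 * iqr

print(f"IQR: {iqr:.3f}")
print(f"Range: {value_range:.2f}")
print(f"Upper fence: {upper_fence:.2f}")
```

The maximum lies far above this fence, which fits the long right tail suggested by the skewness and kurtosis figures for this variable.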

Descriptive statistics

Standard deviation: 12174.99261
Coefficient of variation (CV): 0.659090561
Kurtosis: 22.19305843
Mean: 18472.41234
Median Absolute Deviation (MAD): 6724.48
Skewness: 2.213941648
Sum: 1847241234
Variance: 148230445
Monotonicity: Not monotonic
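
Two of these statistics are simple to reproduce. The sketch below assumes the coefficient of variation is the standard deviation divided by the mean, and that the MAD is the median of absolute deviations from the median, as its label suggests. The helper names and toy data are hypothetical.

```python
import statistics


def coefficient_of_variation(std, mean):
    """Standard deviation expressed as a fraction of the mean."""
    return std / mean


def median_absolute_deviation(values):
    """Median of the absolute deviations from the median."""
    values = list(values)
    med = statistics.median(values)
    return statistics.median(abs(x - med) for x in values)


# Figures copied from the Monthly Debt statistics above.
cv = coefficient_of_variation(12174.99261, 18472.41234)
print(f"{cv:.9f}")

toy = [1, 2, 3, 4, 100]  # the outlier barely moves the MAD
toy_mad = median_absolute_deviation(toy)
print(toy_mad)
```

The MAD is far less sensitive to extreme values than the standard deviation, which is why the two can differ widely for skewed variables.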
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
0 | 74 | 0.1%
15903 | 9 | < 0.1%
11162.88 | 9 | < 0.1%
13033.43 | 8 | < 0.1%
12656.47 | 8 | < 0.1%
10647.98 | 8 | < 0.1%
14726.52 | 8 | < 0.1%
13359.85 | 8 | < 0.1%
12967.5 | 8 | < 0.1%
15343.07 | 8 | < 0.1%
Other values (65755) | 99852 | 99.9%

Value | Count | Frequency (%)
0 | 74 | 0.1%
7.41 | 2 | < 0.1%
12.92 | 1 | < 0.1%
17.1 | 1 | < 0.1%
19.57 | 1 | < 0.1%
20.71 | 1 | < 0.1%
22.23 | 1 | < 0.1%
28.5 | 1 | < 0.1%
34.96 | 2 | < 0.1%
41.99 | 1 | < 0.1%

Value | Count | Frequency (%)
435843.28 | 1 | < 0.1%
229057.92 | 1 | < 0.1%
205801.35 | 1 | < 0.1%
173265.56 | 2 | < 0.1%
172156.15 | 1 | < 0.1%
165810.53 | 1 | < 0.1%
165437.18 | 2 | < 0.1%
152512.24 | 1 | < 0.1%
152331.93 | 1 | < 0.1%
147152.53 | 1 | < 0.1%

Years of Credit History
Real number (ℝ≥0)

Distinct: 506
Distinct (%): 0.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 18.199141
Minimum: 3.6
Maximum: 70.5
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 781.4 KiB

Quantile statistics

Minimum: 3.6
5-th percentile: 9
Q1: 13.5
median: 16.9
Q3: 21.7
95-th percentile: 31.7
Maximum: 70.5
Range: 66.9
Interquartile range (IQR): 8.2

Descriptive statistics

Standard deviation: 7.01532365
Coefficient of variation (CV): 0.385475537
Kurtosis: 1.740701905
Mean: 18.199141
Median Absolute Deviation (MAD): 4
Skewness: 1.071550923
Sum: 1819914.1
Variance: 49.21476591
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
16 | 1340 | 1.3%
15 | 1305 | 1.3%
17 | 1219 | 1.2%
16.5 | 1176 | 1.2%
14 | 1151 | 1.2%
15.4 | 1065 | 1.1%
17.5 | 1025 | 1.0%
13 | 1021 | 1.0%
14.5 | 980 | 1.0%
18 | 967 | 1.0%
Other values (496) | 88751 | 88.8%

Value | Count | Frequency (%)
3.6 | 1 | < 0.1%
3.7 | 2 | < 0.1%
3.8 | 3 | < 0.1%
3.9 | 4 | < 0.1%
4 | 6 | < 0.1%
4.1 | 8 | < 0.1%
4.2 | 20 | < 0.1%
4.3 | 9 | < 0.1%
4.4 | 18 | < 0.1%
4.5 | 17 | < 0.1%

Value | Count | Frequency (%)
70.5 | 1 | < 0.1%
65 | 2 | < 0.1%
60.5 | 2 | < 0.1%
59.9 | 1 | < 0.1%
59.7 | 1 | < 0.1%
59.5 | 2 | < 0.1%
58 | 1 | < 0.1%
57.7 | 2 | < 0.1%
57.5 | 1 | < 0.1%
57 | 1 | < 0.1%

Months since last delinquent
Real number (ℝ≥0)

MISSING

Distinct: 116
Distinct (%): 0.2%
Missing: 53141
Missing (%): 53.1%
Infinite: 0
Infinite (%): 0.0%
Mean: 34.90132098
Minimum: 0
Maximum: 176
Zeros: 216
Zeros (%): 0.2%
Negative: 0
Negative (%): 0.0%
Memory size: 781.4 KiB

Quantile statistics

Minimum: 0
5-th percentile: 5
Q1: 16
median: 32
Q3: 51
95-th percentile: 75
Maximum: 176
Range: 176
Interquartile range (IQR): 35

Descriptive statistics

Standard deviation: 21.99782878
Coefficient of variation (CV): 0.6302864236
Kurtosis: -0.7457896078
Mean: 34.90132098
Median Absolute Deviation (MAD): 17
Skewness: 0.4343615565
Sum: 1635441
Variance: 483.9044711
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
13 | 922 | 0.9%
12 | 902 | 0.9%
14 | 877 | 0.9%
15 | 865 | 0.9%
10 | 861 | 0.9%
8 | 856 | 0.9%
9 | 849 | 0.8%
18 | 847 | 0.8%
16 | 837 | 0.8%
6 | 836 | 0.8%
Other values (106) | 38207 | 38.2%
(Missing) | 53141 | 53.1%

Value | Count | Frequency (%)
0 | 216 | 0.2%
1 | 289 | 0.3%
2 | 418 | 0.4%
3 | 445 | 0.4%
4 | 513 | 0.5%
5 | 703 | 0.7%
6 | 836 | 0.8%
7 | 825 | 0.8%
8 | 856 | 0.9%
9 | 849 | 0.8%

Value | Count | Frequency (%)
176 | 2 | < 0.1%
152 | 1 | < 0.1%
148 | 1 | < 0.1%
143 | 1 | < 0.1%
141 | 1 | < 0.1%
139 | 1 | < 0.1%
131 | 1 | < 0.1%
130 | 1 | < 0.1%
129 | 1 | < 0.1%
120 | 2 | < 0.1%

Number of Open Accounts
Real number (ℝ≥0)

Distinct: 51
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 11.12853
Minimum: 0
Maximum: 76
Zeros: 2
Zeros (%): < 0.1%
Negative: 0
Negative (%): 0.0%
Memory size: 781.4 KiB

Quantile statistics

Minimum: 0
5-th percentile: 5
Q1: 8
median: 10
Q3: 14
95-th percentile: 20
Maximum: 76
Range: 76
Interquartile range (IQR): 6

Descriptive statistics

Standard deviation: 5.00987036
Coefficient of variation (CV): 0.4501825812
Kurtosis: 3.042766891
Mean: 11.12853
Median Absolute Deviation (MAD): 3
Skewness: 1.17920132
Sum: 1112853
Variance: 25.09880103
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
9 | 9360 | 9.4%
10 | 9012 | 9.0%
8 | 8792 | 8.8%
11 | 8601 | 8.6%
7 | 8090 | 8.1%
12 | 7461 | 7.5%
6 | 6731 | 6.7%
13 | 6280 | 6.3%
14 | 5194 | 5.2%
5 | 4742 | 4.7%
Other values (41) | 25737 | 25.7%

Value | Count | Frequency (%)
0 | 2 | < 0.1%
1 | 25 | < 0.1%
2 | 448 | 0.4%
3 | 1364 | 1.4%
4 | 2849 | 2.8%
5 | 4742 | 4.7%
6 | 6731 | 6.7%
7 | 8090 | 8.1%
8 | 8792 | 8.8%
9 | 9360 | 9.4%

Value | Count | Frequency (%)
76 | 2 | < 0.1%
56 | 2 | < 0.1%
52 | 2 | < 0.1%
48 | 4 | < 0.1%
47 | 3 | < 0.1%
45 | 6 | < 0.1%
44 | 5 | < 0.1%
43 | 10 | < 0.1%
42 | 5 | < 0.1%
41 | 7 | < 0.1%

Number of Credit Problems
Real number (ℝ≥0)

HIGH CORRELATION
ZEROS

Distinct: 14
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.16831
Minimum: 0
Maximum: 15
Zeros: 86035
Zeros (%): 86.0%
Negative: 0
Negative (%): 0.0%
Memory size: 781.4 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 0
95-th percentile: 1
Maximum: 15
Range: 15
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 0.4827049554
Coefficient of variation (CV): 2.867951728
Kurtosis: 48.01246512
Mean: 0.16831
Median Absolute Deviation (MAD): 0
Skewness: 4.823135609
Sum: 16831
Variance: 0.2330040739
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=14)

Most frequent values
Value | Count | Frequency (%)
0 | 86035 | 86.0%
1 | 12077 | 12.1%
2 | 1299 | 1.3%
3 | 378 | 0.4%
4 | 125 | 0.1%
5 | 49 | < 0.1%
6 | 17 | < 0.1%
7 | 8 | < 0.1%
8 | 4 | < 0.1%
11 | 2 | < 0.1%
Other values (4) | 6 | < 0.1%

Smallest values
Value | Count | Frequency (%)
0 | 86035 | 86.0%
1 | 12077 | 12.1%
2 | 1299 | 1.3%
3 | 378 | 0.4%
4 | 125 | 0.1%
5 | 49 | < 0.1%
6 | 17 | < 0.1%
7 | 8 | < 0.1%
8 | 4 | < 0.1%
9 | 2 | < 0.1%

Largest values
Value | Count | Frequency (%)
15 | 1 | < 0.1%
12 | 1 | < 0.1%
11 | 2 | < 0.1%
10 | 2 | < 0.1%
9 | 2 | < 0.1%
8 | 4 | < 0.1%
7 | 8 | < 0.1%
6 | 17 | < 0.1%
5 | 49 | < 0.1%
4 | 125 | 0.1%

Current Credit Balance
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 32730
Distinct (%): 32.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 294637.3824
Minimum: 0
Maximum: 32878968
Zeros: 572
Zeros (%): 0.6%
Negative: 0
Negative (%): 0.0%
Memory size: 781.4 KiB

Quantile statistics

Minimum: 0
5-th percentile: 30400
Q1: 112670
Median: 209817
Q3: 367958.75
95-th percentile: 760741
Maximum: 32878968
Range: 32878968
Interquartile range (IQR): 255288.75

Descriptive statistics

Standard deviation: 376170.9347
Coefficient of variation (CV): 1.276725077
Kurtosis: 697.4981871
Mean: 294637.3824
Median Absolute Deviation (MAD): 115007
Skewness: 14.1544283
Sum: 2.946373824 × 10^10
Variance: 1.415045721 × 10^11
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Most frequent values
Value | Count | Frequency (%)
0 | 572 | 0.6%
67697 | 17 | < 0.1%
175978 | 17 | < 0.1%
65683 | 17 | < 0.1%
124013 | 16 | < 0.1%
246373 | 16 | < 0.1%
88426 | 16 | < 0.1%
100301 | 16 | < 0.1%
148200 | 15 | < 0.1%
111682 | 15 | < 0.1%
Other values (32720) | 99283 | 99.3%

Smallest values
Value | Count | Frequency (%)
0 | 572 | 0.6%
19 | 12 | < 0.1%
38 | 9 | < 0.1%
57 | 7 | < 0.1%
76 | 4 | < 0.1%
95 | 6 | < 0.1%
114 | 7 | < 0.1%
133 | 5 | < 0.1%
152 | 2 | < 0.1%
171 | 4 | < 0.1%

Largest values
Value | Count | Frequency (%)
32878968 | 1 | < 0.1%
12986956 | 2 | < 0.1%
12746397 | 2 | < 0.1%
11796435 | 1 | < 0.1%
11361924 | 1 | < 0.1%
9134592 | 2 | < 0.1%
7888952 | 1 | < 0.1%
7749587 | 1 | < 0.1%
7679344 | 1 | < 0.1%
7666747 | 1 | < 0.1%

Maximum Open Credit
Real number (ℝ≥0)

HIGH CORRELATION
SKEWED

Distinct: 44596
Distinct (%): 44.6%
Missing: 2
Missing (%): < 0.1%
Infinite: 0
Infinite (%): 0.0%
Mean: 760798.3817
Minimum: 0
Maximum: 1539737892
Zeros: 681
Zeros (%): 0.7%
Negative: 0
Negative (%): 0.0%
Memory size: 781.4 KiB

Quantile statistics

Minimum: 0
5-th percentile: 109758
Q1: 273438
Median: 467874
Q3: 782958
95-th percentile: 1640226.5
Maximum: 1539737892
Range: 1539737892
Interquartile range (IQR): 509520

Descriptive statistics

Standard deviation: 8384503.472
Coefficient of variation (CV): 11.02066418
Kurtosis: 20394.83985
Mean: 760798.3817
Median Absolute Deviation (MAD): 230901
Skewness: 132.6388749
Sum: 7.607831658 × 10^10
Variance: 7.029989848 × 10^13
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Most frequent values
Value | Count | Frequency (%)
0 | 681 | 0.7%
237204 | 13 | < 0.1%
236412 | 12 | < 0.1%
150194 | 12 | < 0.1%
155474 | 12 | < 0.1%
201652 | 12 | < 0.1%
107404 | 12 | < 0.1%
246136 | 12 | < 0.1%
242968 | 12 | < 0.1%
198088 | 11 | < 0.1%
Other values (44586) | 99209 | 99.2%

Smallest values
Value | Count | Frequency (%)
0 | 681 | 0.7%
4334 | 4 | < 0.1%
4444 | 1 | < 0.1%
5390 | 1 | < 0.1%
6446 | 5 | < 0.1%
6468 | 3 | < 0.1%
6490 | 2 | < 0.1%
6512 | 2 | < 0.1%
6534 | 1 | < 0.1%
6556 | 4 | < 0.1%

Largest values
Value | Count | Frequency (%)
1539737892 | 1 | < 0.1%
1304726170 | 1 | < 0.1%
980305260 | 1 | < 0.1%
798255370 | 1 | < 0.1%
632477736 | 1 | < 0.1%
489343206 | 1 | < 0.1%
380052288 | 1 | < 0.1%
267490058 | 1 | < 0.1%
265512874 | 1 | < 0.1%
192284158 | 2 | < 0.1%

Bankruptcies
Real number (ℝ≥0)

HIGH CORRELATION
ZEROS

Distinct: 8
Distinct (%): < 0.1%
Missing: 204
Missing (%): 0.2%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.11774019
Minimum: 0
Maximum: 7
Zeros: 88774
Zeros (%): 88.8%
Negative: 0
Negative (%): 0.0%
Memory size: 781.4 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 0
95-th percentile: 1
Maximum: 7
Range: 7
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 0.3514238182
Coefficient of variation (CV): 2.98473969
Kurtosis: 18.52767946
Mean: 0.11774019
Median Absolute Deviation (MAD): 0
Skewness: 3.505803676
Sum: 11750
Variance: 0.1234987
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=8)

Most frequent values
Value | Count | Frequency (%)
0 | 88774 | 88.8%
1 | 10475 | 10.5%
2 | 417 | 0.4%
3 | 93 | 0.1%
4 | 27 | < 0.1%
5 | 7 | < 0.1%
6 | 2 | < 0.1%
7 | 1 | < 0.1%
(Missing) | 204 | 0.2%

Smallest values
Value | Count | Frequency (%)
0 | 88774 | 88.8%
1 | 10475 | 10.5%
2 | 417 | 0.4%
3 | 93 | 0.1%
4 | 27 | < 0.1%
5 | 7 | < 0.1%
6 | 2 | < 0.1%
7 | 1 | < 0.1%

Largest values
Value | Count | Frequency (%)
7 | 1 | < 0.1%
6 | 2 | < 0.1%
5 | 7 | < 0.1%
4 | 27 | < 0.1%
3 | 93 | 0.1%
2 | 417 | 0.4%
1 | 10475 | 10.5%
0 | 88774 | 88.8%
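
Because the Bankruptcies value counts above cover every non-missing observation, the descriptive statistics for this variable can be re-derived from them. The sketch below does so in plain Python. It assumes the variance uses the sample (n - 1) denominator, which is what reproduces the reported figure; the variable names are illustrative and not part of the report.

```python
# Value -> count for Bankruptcies, copied from the frequency table
# (the 204 missing rows are excluded, as in the report's statistics).
counts = {0: 88774, 1: 10475, 2: 417, 3: 93, 4: 27, 5: 7, 6: 2, 7: 1}

n = sum(counts.values())                                  # non-missing observations
total = sum(value * count for value, count in counts.items())
mean = total / n

# Sample variance (n - 1 denominator); this is an assumption about the report.
variance = sum(count * (value - mean) ** 2
               for value, count in counts.items()) / (n - 1)
std = variance ** 0.5
cv = std / mean                                           # coefficient of variation

print(n, total)
print(round(mean, 8), round(variance, 7), round(std, 6), round(cv, 5))
```

The count of non-missing rows (100000 - 204) and the sum (11750) match the summary above, and the mean, variance, standard deviation and CV agree with the reported values to the precision shown.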

Tax Liens
Real number (ℝ≥0)

HIGH CORRELATION
ZEROS

Distinct: 12
Distinct (%): < 0.1%
Missing: 10
Missing (%): < 0.1%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.02931293129
Minimum: 0
Maximum: 15
Zeros: 98062
Zeros (%): 98.1%
Negative: 0
Negative (%): 0.0%
Memory size: 781.4 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 0
95-th percentile: 0
Maximum: 15
Range: 15
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 0.2581824362
Coefficient of variation (CV): 8.8078
Kurtosis: 402.0666572
Mean: 0.02931293129
Median Absolute Deviation (MAD): 0
Skewness: 15.50021981
Sum: 2931
Variance: 0.06665817038
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=12)

Most frequent values
Value | Count | Frequency (%)
0 | 98062 | 98.1%
1 | 1343 | 1.3%
2 | 374 | 0.4%
3 | 111 | 0.1%
4 | 58 | 0.1%
5 | 16 | < 0.1%
6 | 12 | < 0.1%
7 | 7 | < 0.1%
9 | 3 | < 0.1%
11 | 2 | < 0.1%
Other values (2) | 2 | < 0.1%
(Missing) | 10 | < 0.1%

Smallest values
Value | Count | Frequency (%)
0 | 98062 | 98.1%
1 | 1343 | 1.3%
2 | 374 | 0.4%
3 | 111 | 0.1%
4 | 58 | 0.1%
5 | 16 | < 0.1%
6 | 12 | < 0.1%
7 | 7 | < 0.1%
9 | 3 | < 0.1%
10 | 1 | < 0.1%

Largest values
Value | Count | Frequency (%)
15 | 1 | < 0.1%
11 | 2 | < 0.1%
10 | 1 | < 0.1%
9 | 3 | < 0.1%
7 | 7 | < 0.1%
6 | 12 | < 0.1%
5 | 16 | < 0.1%
4 | 58 | 0.1%
3 | 111 | 0.1%
2 | 374 | 0.4%

Interactions

[Figure: grid of pairwise interaction plots for the numeric variables (144 panels)]

Correlations


Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
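
The calculation described above can be sketched in plain Python: rank both variables, then apply the covariance-over-standard-deviations formula to the ranks. Giving tied values the average of their positions is an assumption here (it is the usual convention in pandas and SciPy), and the function names and sample data are illustrative only.

```python
def average_ranks(values):
    """Return 1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of equal values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Covariance divided by the product of standard deviations."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sum((a - mean_x) ** 2 for a in x) ** 0.5
    sd_y = sum((b - mean_y) ** 2 for b in y) ** 0.5
    return cov / (sd_x * sd_y)


def spearman_rho(x, y):
    """Spearman's rho: Pearson's r computed on the rank variables."""
    return pearson(average_ranks(x), average_ranks(y))


x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]          # monotonic but nonlinear in x

print(spearman_rho(x, y))        # close to 1: the relationship is monotonic
print(pearson(x, y))             # below 1: the relationship is not linear
```

The cubic example shows the point made above: ρ reaches its maximum for any strictly increasing relationship, while r falls short because the points do not lie on a line.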

Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, so for an exactly linear relationship the steepness of the line does not affect r (only the sign of the slope does).

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
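
A minimal sketch of this formula, together with a check of the location and scale invariance mentioned above, is given below. The `income` and `debt` lists are made-up illustrative values, not rows from the dataset, and `pearson_r` is a hypothetical helper name.

```python
def pearson_r(x, y):
    """Sample covariance of x and y divided by the product of their sample standard deviations."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / (n - 1)
    sd_x = (sum((a - mean_x) ** 2 for a in x) / (n - 1)) ** 0.5
    sd_y = (sum((b - mean_y) ** 2 for b in y) / (n - 1)) ** 0.5
    return cov / (sd_x * sd_y)


# Illustrative values only.
income = [40.0, 55.0, 62.0, 80.0, 95.0]
debt = [8.0, 9.5, 14.0, 15.0, 21.0]

r_original = pearson_r(income, debt)

# Shift and rescale each variable separately (positive scale factors).
r_rescaled = pearson_r([1000 * v + 500 for v in income],
                       [0.5 * v - 3 for v in debt])

# Two exactly linear relationships with very different slopes.
r_steep = pearson_r(income, [10 * v for v in income])
r_shallow = pearson_r(income, [0.1 * v for v in income])

print(r_original, r_rescaled)   # equal up to floating-point rounding
print(r_steep, r_shallow)       # both equal to 1 up to rounding
```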

Kendall's τ

Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is then the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
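
The pair-counting definition above corresponds to the variant usually called τ-a. The sketch below implements exactly that definition in plain Python. Note, as an assumption worth checking, that library implementations often compute the tie-adjusted τ-b instead, so results can differ when the data contain ties. The function name and data are illustrative.

```python
from itertools import combinations


def kendall_tau_a(x, y):
    """(concordant - discordant) / total number of pairs, as described in the text."""
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        direction = (x1 - x2) * (y1 - y2)
        if direction > 0:
            concordant += 1      # both variables move in the same direction
        elif direction < 0:
            discordant += 1      # the variables move in opposite directions
        # direction == 0 means a tie in x or y; it counts toward neither total
    n_pairs = len(x) * (len(x) - 1) // 2
    return (concordant - discordant) / n_pairs


x = [1, 2, 3, 4]
y = [1, 3, 2, 4]                 # one swapped pair: (2, 3) vs (3, 2)

tau = kendall_tau_a(x, y)
print(tau)                       # 5 concordant, 1 discordant, 6 pairs
```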

Cramér's V (φc)

Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. We use the bias-corrected measure proposed by Bergsma in 2013.
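
As a rough illustration of the bias correction, the sketch below computes Cramér's V from two lists of category labels using the correction commonly attributed to Bergsma (2013): the chi-squared-based φ² is reduced by (k - 1)(r - 1)/(n - 1), and the numbers of rows r and columns k are shrunk accordingly. This is a sketch under that assumption, not the report's own implementation; the function name and the category labels are illustrative.

```python
from collections import Counter
from math import sqrt


def cramers_v_corrected(a, b):
    """Bias-corrected Cramér's V for two equally long lists of category labels."""
    n = len(a)
    joint = Counter(zip(a, b))
    row_totals, col_totals = Counter(a), Counter(b)

    # Pearson chi-squared statistic of the contingency table.
    chi2 = 0.0
    for row_value, row_count in row_totals.items():
        for col_value, col_count in col_totals.items():
            expected = row_count * col_count / n
            observed = joint.get((row_value, col_value), 0)
            chi2 += (observed - expected) ** 2 / expected

    phi2 = chi2 / n
    r, k = len(row_totals), len(col_totals)

    # Bias correction of phi^2 and of the table dimensions.
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    denominator = min(r_corr - 1, k_corr - 1)
    return sqrt(phi2_corr / denominator) if denominator > 0 else 0.0


# Illustrative labels: the second variable is fully determined by the first.
ownership = ["Rent", "Own Home", "Rent", "Own Home"] * 25
linked = ["x", "y", "x", "y"] * 25
# Illustrative labels: the two variables are independent.
group = ["A", "A", "B", "B"] * 25
unrelated = ["x", "y", "x", "y"] * 25

v_perfect = cramers_v_corrected(ownership, linked)
v_independent = cramers_v_corrected(group, unrelated)
print(v_perfect, v_independent)
```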

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. Extensive documentation is available from the Phik project.

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion by eye.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
The dendrogram allows you to more fully correlate variable completion, revealing trends deeper than the pairwise ones visible in the correlation heatmap.
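
The report does not state how the nullity correlation is computed. One common approach, assumed in the sketch below, is to turn each column into a present/missing indicator and take the Pearson correlation of the indicators. The three columns are the Credit Score, Annual Income and Months since last delinquent values from the ten first rows shown in the Sample section, with `None` standing in for NaN; the function names are illustrative.

```python
def presence_flags(column):
    """1 where a value is present, 0 where it is missing (None)."""
    return [0 if value is None else 1 for value in column]


def nullity_correlation(col_a, col_b):
    """Pearson correlation between the presence indicators of two columns."""
    a, b = presence_flags(col_a), presence_flags(col_b)
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return float("nan")      # undefined when a column is never or always missing
    return cov / (var_a * var_b) ** 0.5


credit_score = [709.0, None, 741.0, 721.0, None, 7290.0, 730.0, None, 678.0, 739.0]
annual_income = [1167493.0, None, 2231892.0, 806949.0, None,
                 896857.0, 1184194.0, None, 2559110.0, 1454735.0]
months_delinquent = [None, 8.0, 29.0, None, None, None, 10.0, 8.0, 33.0, None]

same_pattern = nullity_correlation(credit_score, annual_income)
different_pattern = nullity_correlation(credit_score, months_delinquent)
print(same_pattern, different_pattern)
```

In these ten rows Credit Score and Annual Income are missing together, so their nullity correlation is 1, which is consistent with both variables reporting the same missing count (19154) in the alerts.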

Sample

First rows

Index | Loan ID | Customer ID | Loan Status | Current Loan Amount | Term | Credit Score | Annual Income | Years in current job | Home Ownership | Purpose | Monthly Debt | Years of Credit History | Months since last delinquent | Number of Open Accounts | Number of Credit Problems | Current Credit Balance | Maximum Open Credit | Bankruptcies | Tax Liens
0 | 14dd8831-6af5-400b-83ec-68e61888a048 | 981165ec-3274-42f5-a3b4-d104041a9ca9 | Fully Paid | 445412 | Short Term | 709.0 | 1167493.0 | 8 years | Home Mortgage | Home Improvements | 5214.74 | 17.2 | NaN | 6 | 1 | 228190 | 416746.0 | 1.0 | 0.0
1 | 4771cc26-131a-45db-b5aa-537ea4ba5342 | 2de017a3-2e01-49cb-a581-08169e83be29 | Fully Paid | 262328 | Short Term | NaN | NaN | 10+ years | Home Mortgage | Debt Consolidation | 33295.98 | 21.1 | 8.0 | 35 | 0 | 229976 | 850784.0 | 0.0 | 0.0
2 | 4eed4e6a-aa2f-4c91-8651-ce984ee8fb26 | 5efb2b2b-bf11-4dfd-a572-3761a2694725 | Fully Paid | 99999999 | Short Term | 741.0 | 2231892.0 | 8 years | Own Home | Debt Consolidation | 29200.53 | 14.9 | 29.0 | 18 | 1 | 297996 | 750090.0 | 0.0 | 0.0
3 | 77598f7b-32e7-4e3b-a6e5-06ba0d98fe8a | e777faab-98ae-45af-9a86-7ce5b33b1011 | Fully Paid | 347666 | Long Term | 721.0 | 806949.0 | 3 years | Own Home | Debt Consolidation | 8741.90 | 12.0 | NaN | 9 | 0 | 256329 | 386958.0 | 0.0 | 0.0
4 | d4062e70-befa-4995-8643-a0de73938182 | 81536ad9-5ccf-4eb8-befb-47a4d608658e | Fully Paid | 176220 | Short Term | NaN | NaN | 5 years | Rent | Debt Consolidation | 20639.70 | 6.1 | NaN | 15 | 0 | 253460 | 427174.0 | 0.0 | 0.0
5 | 89d8cb0c-e5c2-4f54-b056-48a645c543dd | 4ffe99d3-7f2a-44db-afc1-40943f1f9750 | Charged Off | 206602 | Short Term | 7290.0 | 896857.0 | 10+ years | Home Mortgage | Debt Consolidation | 16367.74 | 17.3 | NaN | 6 | 0 | 215308 | 272448.0 | 0.0 | 0.0
6 | 273581de-85d8-4332-81a5-19b04ce68666 | 90a75dde-34d5-419c-90dc-1e58b04b3e35 | Fully Paid | 217646 | Short Term | 730.0 | 1184194.0 | < 1 year | Home Mortgage | Debt Consolidation | 10855.08 | 19.6 | 10.0 | 13 | 1 | 122170 | 272052.0 | 1.0 | 0.0
7 | db0dc6e1-77ee-4826-acca-772f9039e1c7 | 018973c9-e316-4956-b363-67e134fb0931 | Charged Off | 648714 | Long Term | NaN | NaN | < 1 year | Home Mortgage | Buy House | 14806.13 | 8.2 | 8.0 | 15 | 0 | 193306 | 864204.0 | 0.0 | 0.0
8 | 8af915d9-9e91-44a0-b5a2-564a45c12089 | af534dea-d27e-4fd6-9de8-efaa52a78ec0 | Fully Paid | 548746 | Short Term | 678.0 | 2559110.0 | 2 years | Rent | Debt Consolidation | 18660.28 | 22.6 | 33.0 | 4 | 0 | 437171 | 555038.0 | 0.0 | 0.0
9 | 0b1c4e3d-bd97-45ce-9622-22732fcdc9a0 | 235c4a43-dadf-483d-aa44-9d6d77ae4583 | Fully Paid | 215952 | Short Term | 739.0 | 1454735.0 | < 1 year | Rent | Debt Consolidation | 39277.75 | 13.9 | NaN | 20 | 0 | 669560 | 1021460.0 | 0.0 | 0.0

Last rows

Index | Loan ID | Customer ID | Loan Status | Current Loan Amount | Term | Credit Score | Annual Income | Years in current job | Home Ownership | Purpose | Monthly Debt | Years of Credit History | Months since last delinquent | Number of Open Accounts | Number of Credit Problems | Current Credit Balance | Maximum Open Credit | Bankruptcies | Tax Liens
99990 | 686017b3-dc24-4f8a-af92-0bd077452d3d | 1a583add-21ba-410f-9c42-757c4ed19322 | Fully Paid | 99999999 | Short Term | 742.0 | 1190046.0 | < 1 year | Rent | other | 11969.81 | 20.1 | 16.0 | 9 | 0 | 37392 | 134442.0 | 0.0 | 0.0
99991 | 326d0f2b-015f-480e-90e9-9c0d7d307196 | ed9a397b-8a72-45c2-92de-b91f990a623d | Fully Paid | 244266 | Short Term | 714.0 | 1619047.0 | 10+ years | Rent | Debt Consolidation | 4290.39 | 21.4 | NaN | 5 | 1 | 132012 | 242660.0 | 1.0 | 0.0
99992 | c568adaa-16f9-43d3-b522-8532fb57cb16 | cbb29fd6-e418-4f09-a4bd-4de83428caab | Fully Paid | 48796 | Short Term | NaN | NaN | 4 years | Home Mortgage | major_purchase | 8298.63 | 8.3 | NaN | 9 | 0 | 87875 | 239404.0 | 0.0 | 0.0
99993 | 79b81158-5d55-4766-8ad6-ebcd683f7d59 | e45e8dc4-05ad-4efe-92cc-784a6d5ef61a | Fully Paid | 44484 | Short Term | 717.0 | 1152426.0 | 10+ years | Home Mortgage | small_business | 6280.64 | 21.0 | 12.0 | 6 | 0 | 961932 | 0.0 | 0.0 | 0.0
99994 | 8506a4e9-af7d-47d2-a1bf-7ea2c41858f0 | be67200e-1ef1-4b63-86a6-2bf27d3c704d | Fully Paid | 210584 | Short Term | 719.0 | 783389.0 | 1 year | Home Mortgage | Other | 3727.61 | 17.4 | 18.0 | 6 | 0 | 456 | 259160.0 | 0.0 | 0.0
99995 | 3f94c18c-ba8f-45d0-8610-88a684a410a9 | 2da51983-cfef-4b8f-a733-5dfaf69e9281 | Fully Paid | 147070 | Short Term | 725.0 | 475437.0 | 7 years | Own Home | other | 2202.86 | 22.3 | NaN | 5 | 0 | 47766 | 658548.0 | 0.0 | 0.0
99996 | 06eba04f-58fc-424a-b666-ed72aa008900 | 77f2252a-b7d1-4b07-a746-1202a8304290 | Fully Paid | 99999999 | Short Term | 732.0 | 1289416.0 | 1 year | Rent | Debt Consolidation | 13109.05 | 9.4 | 21.0 | 22 | 0 | 153045 | 509234.0 | 0.0 | 0.0
99997 | e1cb4050-eff5-4bdb-a1b0-aabd3f7eaac7 | 2ced5f10-bd60-4a11-9134-cadce4e7b0a3 | Fully Paid | 103136 | Short Term | 742.0 | 1150545.0 | 6 years | Rent | Debt Consolidation | 7315.57 | 18.8 | 18.0 | 12 | 1 | 109554 | 537548.0 | 1.0 | 0.0
99998 | 81ab928b-d1a5-4523-9a3c-271ebb01b4fb | 3e45ffda-99fd-4cfc-b8b8-446f4a505f36 | Fully Paid | 530332 | Short Term | 746.0 | 1717524.0 | 9 years | Rent | Debt Consolidation | 9890.07 | 15.0 | NaN | 8 | 0 | 404225 | 738254.0 | 0.0 | 0.0
99999 | c63916c6-6d46-47a9-949a-51d09af4414f | 1b3014be-5c07-4d41-abe7-44573c375886 | Fully Paid | 99999999 | Short Term | 743.0 | 935180.0 | NaN | Own Home | Debt Consolidation | 9118.10 | 13.0 | NaN | 4 | 1 | 45600 | 91014.0 | 1.0 | 0.0

Duplicate rows

Most frequently occurring

Index | Loan ID | Customer ID | Loan Status | Current Loan Amount | Term | Credit Score | Annual Income | Years in current job | Home Ownership | Purpose | Monthly Debt | Years of Credit History | Months since last delinquent | Number of Open Accounts | Number of Credit Problems | Current Credit Balance | Maximum Open Credit | Bankruptcies | Tax Liens | # duplicates
0 | 000bc65a-6a7c-4566-86f3-203b4ec35eca | 724bddb4-a23c-4759-ba6f-dc79c7dd5334 | Fully Paid | 642202 | Short Term | 715.0 | 1759533.0 | 2 years | Rent | Debt Consolidation | 23020.59 | 13.8 | NaN | 11 | 0 | 445987 | 733546.0 | 0.0 | 0.0 | 2
1 | 000c16df-c24f-41cf-a90e-60301d131bb9 | b07c4262-70bb-41cc-b28a-d87540577fb1 | Fully Paid | 155496 | Short Term | 706.0 | 664753.0 | NaN | Own Home | other | 8087.92 | 21.3 | NaN | 7 | 1 | 79382 | 150700.0 | 1.0 | 0.0 | 2
2 | 0016d326-7878-46bb-9c18-a75af255d7fe | eccd2965-56cf-4be0-99b2-893f8a520fea | Fully Paid | 88770 | Short Term | 700.0 | 1671221.0 | 2 years | Home Mortgage | Home Improvements | 2562.53 | 10.3 | NaN | 3 | 1 | 67279 | 124234.0 | 1.0 | 0.0 | 2
3 | 0018f629-8cef-48bd-bb93-40179f24256c | 1e96933c-3a01-46b2-975d-06c5a2b469c3 | Fully Paid | 66396 | Short Term | 711.0 | 535192.0 | 3 years | Rent | Debt Consolidation | 9142.80 | 15.8 | 50.0 | 8 | 0 | 112347 | 307538.0 | 0.0 | 0.0 | 2
4 | 001a84a9-3fd5-4e82-9153-49325b996408 | b282e6f9-2d09-4988-b579-6d90d104e70d | Fully Paid | 180246 | Long Term | 658.0 | 858097.0 | 2 years | Rent | Other | 10289.83 | 18.4 | 58.0 | 8 | 0 | 288553 | 440220.0 | 0.0 | 0.0 | 2
5 | 001f3ce7-5277-4202-8511-27b464ea6404 | 96875077-865f-4998-9e93-59b016165fdf | Fully Paid | 65406 | Short Term | 705.0 | 1883090.0 | 3 years | Home Mortgage | Home Improvements | 56021.88 | 23.4 | NaN | 11 | 0 | 408291 | 527648.0 | 0.0 | 0.0 | 2
6 | 002f3769-888e-4b66-b940-21ec4b7c2c69 | f49f2a18-deba-45de-8af6-e0174b9b4894 | Fully Paid | 142824 | Short Term | 724.0 | 683183.0 | 4 years | Home Mortgage | Debt Consolidation | 11272.51 | 15.4 | NaN | 7 | 0 | 27322 | 77000.0 | 0.0 | 0.0 | 2
7 | 003665f5-adff-4fbb-903c-14d022fa6a08 | 80c0ee25-ec87-4e1f-aefc-e5bbd679cef1 | Fully Paid | 43692 | Short Term | 719.0 | 679079.0 | 9 years | HaveMortgage | Take a Trip | 8601.68 | 11.0 | 69.0 | 5 | 5 | 16264 | 174328.0 | 1.0 | 4.0 | 2
8 | 003d60c3-3a4c-4b81-b0f4-3105c34ce2e8 | 9d1e355e-f80d-40a8-b582-d391baa0dce9 | Fully Paid | 194370 | Short Term | 714.0 | 991952.0 | 10+ years | Home Mortgage | other | 11242.11 | 21.0 | 32.0 | 6 | 0 | 168017 | 448272.0 | 0.0 | 0.0 | 2
9 | 0046bac8-1053-4385-8ffe-93dcebd8ee7d | 4c8dd1fd-1831-4d32-a6e4-cbe4702b2020 | Fully Paid | 515064 | Long Term | 724.0 | 1457851.0 | 8 years | Home Mortgage | Home Improvements | 6900.42 | 28.5 | 49.0 | 11 | 0 | 52288 | 535854.0 | 0.0 | 0.0 | 2
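
A table like the one above can be approximated by counting identical rows. The plain-Python sketch below uses a handful of shortened, illustrative rows (the truncated IDs are placeholders, not real keys). It assumes the convention that every occurrence after the first counts as a duplicate; the report does not state which convention it uses.

```python
from collections import Counter

# Illustrative rows: (shortened Loan ID, Loan Status, Current Loan Amount).
rows = [
    ("000bc65a", "Fully Paid", 642202),
    ("000c16df", "Fully Paid", 155496),
    ("000bc65a", "Fully Paid", 642202),
    ("14dd8831", "Fully Paid", 445412),
    ("000c16df", "Fully Paid", 155496),
]

row_counts = Counter(rows)

# Rows that occur more than once, with their number of occurrences.
duplicated_groups = {row: count for row, count in row_counts.items() if count > 1}

# Assumed convention: each repeat after the first occurrence is one duplicate row.
n_duplicate_rows = sum(count - 1 for count in row_counts.values())
duplicate_share = n_duplicate_rows / len(rows)

for row, count in duplicated_groups.items():
    print(row, count)
print(n_duplicate_rows, duplicate_share)
```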